ABSTRACT


This paper considers estimation of a finite population mean under a two-phase sampling procedure involving two auxiliary
variables, assuming that the population mean of the first (main) auxiliary variable is unknown whereas the population
mean of the second (additional) auxiliary variable is known accurately. The problem is addressed through two
generalized ratio-type estimators that constitute two separate, though not necessarily disjoint, families/classes of estimators.
Some optimum properties of the proposed generalized estimators are investigated, and sufficient conditions for their
superiority over the classical two-phase ratio estimator are reported. After identifying some ratio/ratio-type
estimators as specific cases of the generalized estimators, both analytical and empirical comparisons among various
estimators are undertaken to show the effectiveness of the proposed estimation technique.


KEYWORDS: Auxiliary Variable, Ratio Estimator, Two-Phase Sampling 


Article History 
Received: 06 Aug 2022 | Revised: 09 Aug 2022 | Accepted: 19 Aug 2022


INTRODUCTION 


Let the study variable $y$ and an auxiliary variable $x$ be defined on a finite population $U$ of $N$ units with $(y_i, x_i)$,
$i = 1, 2, \ldots, N$, as their observed values on the $i$th unit. When the correlation coefficient between the two variables has a high
positive value and no prior information is available on the population mean $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} x_i$, one of the most
advantageous estimation strategies for the population mean $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} y_i$ is the classical ratio estimator in conjunction with
two-phase or double sampling. For our purpose, let the two-phase sampling methodology be described in the
following manner:


• In the first phase, a large initial sample (called the first phase sample) $s_1$ ($s_1 \subset U$) of $n_1$ units is taken from the
population by simple random sampling without replacement (SRSWOR) to obtain an acceptable estimate of $\bar{X}$ by
measuring the values of $x$ for all the $n_1$ sampled units.


• In the second phase, a sub-sample (called the second phase sample) $s_2$ ($s_2 \subset s_1$) of $n_2$ units is selected from $s_1$
by SRSWOR to measure the main characteristic under study $y$ for each of these $n_2$ units.


Let $\bar{x}_1 = \frac{1}{n_1}\sum_{i \in s_1} x_i$ be the sample mean of $x$ based on the first phase sample of $n_1$ units, and let
$\bar{y}_2 = \frac{1}{n_2}\sum_{i \in s_2} y_i$ and $\bar{x}_2 = \frac{1}{n_2}\sum_{i \in s_2} x_i$ be the sample means of $y$ and $x$ respectively based on the second phase
sample of $n_2$ units. Then the two-phase sampling classical ratio estimator for $\bar{Y}$ is defined by
$$t_R = \bar{y}_2\,\frac{\bar{x}_1}{\bar{x}_2}.$$
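The two-phase procedure and the classical ratio estimator can be sketched in a few lines of Python. This is only an illustrative simulation under assumptions of ours: the function name `two_phase_ratio_estimate`, the `seed` argument and the toy population are hypothetical, and the whole population is held in memory purely so that samples can be drawn from it.

```python
import random


def two_phase_ratio_estimate(y, x, n1, n2, seed=0):
    """Return t_R = ybar2 * xbar1 / xbar2 for one two-phase SRSWOR draw.

    y, x : population values (y is only *used* on the second-phase units).
    n1   : first-phase sample size, n2 : second-phase sample size (n2 <= n1).
    """
    rng = random.Random(seed)
    N = len(y)
    s1 = rng.sample(range(N), n1)      # first-phase sample, SRSWOR from U
    s2 = rng.sample(s1, n2)            # second-phase sub-sample, SRSWOR from s1
    xbar1 = sum(x[i] for i in s1) / n1
    xbar2 = sum(x[i] for i in s2) / n2
    ybar2 = sum(y[i] for i in s2) / n2
    return ybar2 * xbar1 / xbar2


# Toy population (hypothetical): y is exactly proportional to x.
x = list(range(1, 21))
y = [2.0 * v for v in x]
estimate = two_phase_ratio_estimate(y, x, n1=20, n2=5)
```

In this toy case the first phase covers the whole population and $y = 2x$ exactly, so the ratio estimator recovers $\bar{Y} = 21$ whatever second-phase sample is drawn, which illustrates why a line through the origin is the favourable situation for $t_R$.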


Although $t_R$ is biased, for large sample sizes the bias is usually negligible and the approximate expression for the
mean square error (MSE) is given by
$$M(t_R) = \bar{Y}^2\left[\theta_2 C_y^2 + (\theta_2 - \theta_1)(C_x^2 - 2C_{yx})\right], \tag{1}$$
where $\theta_1 = \frac{1}{n_1} - \frac{1}{N}$, $\theta_2 = \frac{1}{n_2} - \frac{1}{N}$, $C_y^2 = S_y^2/\bar{Y}^2$, $C_x^2 = S_x^2/\bar{X}^2$ and $C_{yx} = S_{yx}/(\bar{Y}\bar{X})$, with
$S_y^2 = \sum_{i=1}^{N}(y_i - \bar{Y})^2/(N-1)$ and $S_x^2 = \sum_{i=1}^{N}(x_i - \bar{X})^2/(N-1)$ as the population variances of $y$ and $x$, and
$S_{yx} = \sum_{i=1}^{N}(y_i - \bar{Y})(x_i - \bar{X})/(N-1)$ as the population covariance between $y$ and $x$.


Improvement over $t_R$ is attainable either by redesigning the sampling scheme or by reshaping the estimator
so as to bring a considerable variance reduction compared to $t_R$. Another course of action for achieving this is the involvement
of one or more additional auxiliary variables. In this work, we apply some modification techniques to $t_R$ with the aid of an
additional auxiliary variable $z$ to build some new ratio-type estimators.


2. ASSOCIATION OF A SECOND AUXILIARY VARIABLE 


Following Chand (1975) and Kiregyera (1980, 1984), let us consider a real-life situation where information on $\bar{X}$ is lacking
before the start of a survey operation whereas the values of a secondary auxiliary variable $z$ are known for the entire finite
population, so that the population mean $\bar{Z}$ is known accurately. It is also expected that, like $x$, $z$ is highly correlated with
$y$. For instance, consider a survey conducted for the estimation of the cattle population of a backward district with villages
as the sampling units, where $y_i$, $x_i$ and $z_i$ are respectively the cattle population, total area under grass lands and geographical
area of the $i$th village. In this case, the value of $x_i$ may not be available, but the value of $z_i$ may be known from the district
records, and accordingly the exact value of $\bar{Z}$ can be calculated.


The two-phase sampling mechanism in the present context is such that the preliminary sample $s_1$ is used to collect
measurements on $(x, z)$ whereas the second phase sample $s_2$ is used to collect measurements on $y$ only. The key idea
behind this is to make reasonable estimates of $\bar{X}$ based on the measured values $(x_i, z_i)$, $i \in s_1$. Of course, the precision of
such an estimate is influenced by the strength of the correlation between $x$ and $z$.


Estimation of $\bar{Y}$ under the above framework was first considered by Chand (1975) and Sukhatme and
Chand (1977), and subsequently studied in greater detail by Kiregyera (1980, 1984). In due course, several authors
inspired by Chand (1975) and Kiregyera (1980, 1984) composed a large variety of estimators (ratio-, product- or
regression-type). In our study, however, we emphasize the creation of estimators with the two-phase ratio estimator
$t_R$ as the base.


By convention, if $z$ has a high positive correlation with $x$, the ratio estimator $\bar{x}_1\bar{Z}/\bar{z}_1$ will estimate $\bar{X}$ more
accurately than $\bar{x}_1$. Thus, replacing $\bar{x}_1$ by $\bar{x}_1\bar{Z}/\bar{z}_1$ in $t_R$, Chand (1975) suggested a ratio-in-ratio estimator defined by
$$t_{RR} = \bar{y}_2\,\frac{\bar{x}_1}{\bar{x}_2}\,\frac{\bar{Z}}{\bar{z}_1},$$
where $\bar{z}_1 = \frac{1}{n_1}\sum_{i \in s_1} z_i$. On the contrary, if $z$ has a high negative correlation with $x$, the product estimator $\bar{x}_1\bar{z}_1/\bar{Z}$
will estimate $\bar{X}$ more accurately than $\bar{x}_1$. Accordingly, we may consider a product-in-ratio estimator of the form
$$t_{PR} = \bar{y}_2\,\frac{\bar{x}_1}{\bar{x}_2}\,\frac{\bar{z}_1}{\bar{Z}}.$$


Assuming that the regression line of $x$ on $z$ is linear but does not pass through the origin, Kiregyera (1980) recommended
the use of a regression estimator $\bar{x}_1 - b_{xz(1)}(\bar{z}_1 - \bar{Z})$ in place of $\bar{x}_1$ and proposed a regression-in-ratio estimator given by
$$t_{RGR} = \bar{y}_2\,\frac{\bar{x}_1 - b_{xz(1)}(\bar{z}_1 - \bar{Z})}{\bar{x}_2},$$
where $b_{xz(1)} = \dfrac{\sum_{i \in s_1}(x_i - \bar{x}_1)(z_i - \bar{z}_1)}{\sum_{i \in s_1}(z_i - \bar{z}_1)^2}$ is the sample regression coefficient of $x$ on $z$ for $s_1$.


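Asymptotic expressions for the MSEs of $t_{RR}$, $t_{PR}$ and $t_{RGR}$ are as follows:
$$M(t_{RR}) = M(t_R) + \bar{Y}^2\theta_1(C_z^2 - 2C_{yz}), \tag{2}$$
$$M(t_{PR}) = M(t_R) + \bar{Y}^2\theta_1(C_z^2 + 2C_{yz}), \tag{3}$$
$$M(t_{RGR}) = M(t_R) + \bar{Y}^2\theta_1(\rho_{xz}^2C_x^2 - 2\rho_{yz}\rho_{xz}C_yC_x), \tag{4}$$
where $C_z^2 = S_z^2/\bar{Z}^2$, $C_{yz} = S_{yz}/(\bar{Y}\bar{Z})$, $\rho_{yz}$ and $\rho_{xz}$ are respectively the correlation coefficients between $y$ and $z$, and $x$ and
$z$, $S_z^2 = \frac{1}{N-1}\sum_{i=1}^{N}(z_i - \bar{Z})^2$ and $S_{yz} = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})(z_i - \bar{Z})$.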


Comparing MSEs, we see that $t_{RR}$, $t_{PR}$ and $t_{RGR}$ are likely to be more efficient than $t_R$ if
$$\rho_{yz} > \frac{1}{2}\frac{C_z}{C_y}, \qquad \rho_{yz} < -\frac{1}{2}\frac{C_z}{C_y} \qquad\text{and}\qquad \rho_{yz} > \frac{1}{2}\rho_{xz}\frac{C_x}{C_y} \tag{5}$$
respectively (the last for $\rho_{xz} > 0$). These conditions therefore indicate that the strength and sign of the relationship between $y$
and $z$ also play an essential role in the search for alternative estimators of the unknown mean $\bar{X}$ using $z$ as an
auxiliary variable.
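As a rough numerical companion to expressions (1) to (4) and the conditions in (5), the sketch below evaluates the four asymptotic MSEs for given population parameters. The function names `theta` and `chain_mses` and all numerical inputs are hypothetical choices of ours, used only for illustration.

```python
def theta(n, N):
    """Finite-population factor 1/n - 1/N used in theta_1 and theta_2."""
    return 1.0 / n - 1.0 / N


def chain_mses(Ybar, Cy, Cx, Cz, rho_yx, rho_yz, rho_xz, n1, n2, N):
    """Asymptotic MSEs of t_R, t_RR, t_PR and t_RGR from (1)-(4)."""
    th1, th2 = theta(n1, N), theta(n2, N)
    Cyx = rho_yx * Cy * Cx          # C_yx = rho_yx * C_y * C_x
    Cyz = rho_yz * Cy * Cz          # C_yz = rho_yz * C_y * C_z
    m_r = Ybar**2 * (th2 * Cy**2 + (th2 - th1) * (Cx**2 - 2 * Cyx))       # (1)
    m_rr = m_r + Ybar**2 * th1 * (Cz**2 - 2 * Cyz)                        # (2)
    m_pr = m_r + Ybar**2 * th1 * (Cz**2 + 2 * Cyz)                        # (3)
    m_rgr = m_r + Ybar**2 * th1 * (rho_xz**2 * Cx**2
                                   - 2 * rho_yz * rho_xz * Cy * Cx)       # (4)
    return {"t_R": m_r, "t_RR": m_rr, "t_PR": m_pr, "t_RGR": m_rgr}


# Hypothetical parameter values with all three correlations positive.
mses = chain_mses(Ybar=50.0, Cy=1.0, Cx=1.0, Cz=1.0,
                  rho_yx=0.9, rho_yz=0.8, rho_xz=0.7, n1=30, n2=12, N=200)
```

With these inputs $\rho_{yz} = 0.8$ exceeds both $\frac{1}{2}C_z/C_y = 0.5$ and $\frac{1}{2}\rho_{xz}C_x/C_y = 0.35$, so $t_{RR}$ and $t_{RGR}$ come out with smaller MSE than $t_R$, while $t_{PR}$ comes out with larger MSE, in line with (5).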
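Instead of considering $\bar{x}_1$, $\bar{x}_1\bar{Z}/\bar{z}_1$, $\bar{x}_1\bar{z}_1/\bar{Z}$ and $\bar{x}_1 - b_{xz(1)}(\bar{z}_1 - \bar{Z})$ as estimators of $\bar{X}$, we may also consider
more generally the difference estimator $\bar{x}_1 - d(\bar{z}_1 - \bar{Z})$ and accordingly define a generalized estimator for $\bar{Y}$ by
$$t = \bar{y}_2\,\frac{\bar{x}_1 - d(\bar{z}_1 - \bar{Z})}{\bar{x}_2}.$$
The estimators $t_R$, $t_{RR}$, $t_{PR}$ and $t_{RGR}$ arise as its special cases when $d = 0$, $\bar{x}_1/\bar{z}_1$, $-\bar{x}_1/\bar{Z}$ and $b_{xz(1)}$
respectively.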
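The reduction of the generalized estimator to its special cases is easy to check numerically. The following sketch uses a hypothetical helper `t_general` and made-up sample means; it only verifies the algebra stated above.

```python
def t_general(ybar2, xbar1, xbar2, zbar1, Zbar, d):
    """Generalized estimator t = ybar2 * (xbar1 - d*(zbar1 - Zbar)) / xbar2."""
    return ybar2 * (xbar1 - d * (zbar1 - Zbar)) / xbar2


# Hypothetical sample means and known population mean of z.
ybar2, xbar1, xbar2, zbar1, Zbar = 40.0, 21.0, 19.0, 11.0, 10.0

t_R = t_general(ybar2, xbar1, xbar2, zbar1, Zbar, d=0.0)
t_RR = t_general(ybar2, xbar1, xbar2, zbar1, Zbar, d=xbar1 / zbar1)
t_PR = t_general(ybar2, xbar1, xbar2, zbar1, Zbar, d=-xbar1 / Zbar)
```

Here `t_RR` equals $\bar{y}_2\bar{x}_1\bar{Z}/(\bar{x}_2\bar{z}_1)$ and `t_PR` equals $\bar{y}_2\bar{x}_1\bar{z}_1/(\bar{x}_2\bar{Z})$ up to rounding, because $\bar{x}_1 - (\bar{x}_1/\bar{z}_1)(\bar{z}_1 - \bar{Z}) = \bar{x}_1\bar{Z}/\bar{z}_1$ and $\bar{x}_1 + (\bar{x}_1/\bar{Z})(\bar{z}_1 - \bar{Z}) = \bar{x}_1\bar{z}_1/\bar{Z}$.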
3. TWO GENERALIZED ESTIMATORS UNDER A MODIFIED APPROACH


If we analyze the formulation techniques of the different estimators included in the preceding review, as well as some others
available in the literature, we note that the estimators are simply recommended by the concerned authors,
with no explanation of the technique adopted for their construction [see, for example, Mukerjee et al.
(1987), Srivenkataramana and Tracy (1989), Srivastava et al. (1988, 1990)]. In this work we therefore seek to develop
a general framework addressing how the complete auxiliary information available on $z$ can be used effectively at the
estimation stage. The purpose is to improve on the efficiency of $t_R$ through certain
modifications of the approach of Chand (1975) and Kiregyera (1980), henceforth called the Chand-Kiregyera
approach.


An inspection of the compositions of $t_{RR}$, $t_{PR}$ and $t_{RGR}$ explained in the previous section shows that the selections of
$\bar{x}_1\bar{Z}/\bar{z}_1$, $\bar{x}_1\bar{z}_1/\bar{Z}$ and $\bar{x}_1 - b_{xz(1)}(\bar{z}_1 - \bar{Z})$ over $\bar{x}_1$ in the standard two-phase ratio estimator $t_R$ were just due to the fact
that the former estimators are more efficient than the latter for estimating $\bar{X}$ under certain conditions. However, when
the question of efficiency comes, we also believe that $\bar{x}_2$ provides a less efficient estimate of $\bar{X}$ than $\bar{x}_1$. Hence, this school
of thought also encourages the selection of alternative estimators for $\bar{x}_2$ in terms of the second covariate $z$. To
generalize our estimation methodology we prefer to use difference estimators, for which we have two options: use of either
$\bar{x}_2 - \eta(\bar{z}_2 - \bar{z}_1)$ or $\bar{x}_2 - \omega(\bar{z}_2 - \bar{Z})$ in place of $\bar{x}_2$, where $\bar{z}_2 = \frac{1}{n_2}\sum_{i \in s_2} z_i$. At the same time, we also consider $\bar{x}_1 - d_1(\bar{z}_1 - \bar{Z})$ or
$\bar{x}_1 - d_2(\bar{z}_1 - \bar{Z})$ as alternatives to $\bar{x}_1$. These arrangements give rise to the following generalized ratio-type estimators:
$$t_1^{(G)} = \bar{y}_2\,\frac{\bar{x}_1 - d_1(\bar{z}_1 - \bar{Z})}{\bar{x}_2 - \eta(\bar{z}_2 - \bar{z}_1)},$$
$$t_2^{(G)} = \bar{y}_2\,\frac{\bar{x}_1 - d_2(\bar{z}_1 - \bar{Z})}{\bar{x}_2 - \omega(\bar{z}_2 - \bar{Z})}.$$
The coefficients $\eta$, $d_1$, $\omega$ and $d_2$ appearing in the generalized estimators are either suitably chosen constants or
random variables converging to some finite values. In actual practice, the coefficients are determined so as to
control the mean square errors of the estimators.
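The two generalized forms can be written as plain functions of the sample means. The sketch below is an illustration under our own naming (`t1_G`, `t2_G`) with made-up sample means; it checks that both forms collapse to $t_R$ when all coefficients vanish, and that a common chain-ratio form can be reached from either class.

```python
def t1_G(ybar2, xbar1, xbar2, zbar1, zbar2, Zbar, eta, d1):
    """First generalized estimator: denominator adjusted by (zbar2 - zbar1)."""
    return ybar2 * (xbar1 - d1 * (zbar1 - Zbar)) / (xbar2 - eta * (zbar2 - zbar1))


def t2_G(ybar2, xbar1, xbar2, zbar1, zbar2, Zbar, omega, d2):
    """Second generalized estimator: denominator adjusted by (zbar2 - Zbar)."""
    return ybar2 * (xbar1 - d2 * (zbar1 - Zbar)) / (xbar2 - omega * (zbar2 - Zbar))


# Hypothetical sample means and known population mean of z.
ybar2, xbar1, xbar2, zbar1, zbar2, Zbar = 40.0, 21.0, 19.0, 11.0, 9.5, 10.0
means = (ybar2, xbar1, xbar2, zbar1, zbar2, Zbar)

t_R = ybar2 * xbar1 / xbar2
chain = ybar2 * xbar1 * zbar2 / (xbar2 * zbar1)

# Same chain-ratio form obtained from each class with different coefficients.
from_first = t1_G(*means, eta=xbar2 / zbar2, d1=0.0)
from_second = t2_G(*means, omega=xbar2 / zbar2, d2=xbar1 / zbar1)
```

Both `from_first` and `from_second` coincide with `chain`, which is one concrete way to see that the two classes overlap.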


Our generalized estimators are very flexible in the sense that they reduce to a large number of estimators
using either one or two supplementary variables for suitable selections of the coefficients. Hence, they generate
classes/families of estimators for $\bar{Y}$. For the simplest case when $\eta = \omega = 0$ and $d_1 = d_2 = d$, which the authors describe as the case of
using one auxiliary variable $x$, we have $t_1^{(G)} = t_2^{(G)} = t$, the generalized estimator defined in Section 2. On the other
hand, when $\eta = \omega = 0$ and $d_1 = d_2 = 0$, $t_1^{(G)} = t_2^{(G)} = t_R$, the two-phase classical ratio estimator that is our base
estimator. These results imply that the classes of estimators defined by $t_1^{(G)}$ and $t_2^{(G)}$ are not necessarily disjoint. We also
briefly present some specific cases of $t_1^{(G)}$ and $t_2^{(G)}$ for suitable selections of their coefficients and define the new
estimators so produced as follows:


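Estimators Arising Out from Both $t_1^{(G)}$ and $t_2^{(G)}$
$$t_{11} = \bar{y}_2\,\frac{\bar{x}_1\bar{z}_2}{\bar{x}_2\bar{z}_1}, \qquad t_{12} = \bar{y}_2\,\frac{\bar{x}_1\bar{z}_1}{\bar{x}_2\bar{z}_2}.$$
Estimators Arising Out from $t_1^{(G)}$
$$t_{13} = \bar{y}_2\,\frac{\bar{x}_1}{\bar{x}_2 - b_{xz(2)}(\bar{z}_2 - \bar{z}_1)}, \qquad t_{14} = \bar{y}_2\,\frac{\bar{x}_1\bar{z}_2\bar{Z}}{\bar{x}_2\bar{z}_1^2}, \qquad t_{15} = \bar{y}_2\,\frac{\bar{x}_1\bar{z}_1^2}{\bar{x}_2\bar{z}_2\bar{Z}}, \qquad t_{16} = \bar{y}_2\,\frac{\bar{x}_1 - b_{xz(1)}(\bar{z}_1 - \bar{Z})}{\bar{x}_2 - b_{xz(2)}(\bar{z}_2 - \bar{z}_1)},$$
where $b_{xz(2)} = \dfrac{\sum_{i \in s_2}(x_i - \bar{x}_2)(z_i - \bar{z}_2)}{\sum_{i \in s_2}(z_i - \bar{z}_2)^2}$.


Estimators Arising Out from $t_2^{(G)}$
$$t_{21} = \bar{y}_2\,\frac{\bar{x}_1\bar{z}_2}{\bar{x}_2\bar{Z}}, \qquad t_{22} = \bar{y}_2\,\frac{\bar{x}_1\bar{Z}}{\bar{x}_2\bar{z}_2}, \qquad t_{23} = \bar{y}_2\,\frac{\bar{x}_1}{\bar{x}_2 - b_{xz(2)}(\bar{z}_2 - \bar{Z})}, \qquad t_{24} = \bar{y}_2\,\frac{\bar{x}_1 - b_{xz(1)}(\bar{z}_1 - \bar{Z})}{\bar{x}_2 - b_{xz(2)}(\bar{z}_2 - \bar{Z})}.$$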


4. COMPARISON OF SOME SPECIFIC ESTIMATORS 


In order to establish the merit of our modified approach over the Chand-Kiregyera approach, we now make MSE
comparisons between some specific estimators derived in Section 3 and those considered in Section 2. To keep our
comparative study manageable, we restrict ourselves to situations in which the three variables $y$, $x$ and $z$ are positively
correlated. For this reason, we exclude $t_{PR}$, $t_{12}$, $t_{15}$ and $t_{22}$ from the comparison, and considering structural resemblance
we compare $t_{11}$, $t_{14}$, $t_{21}$ with $t_R$, $t_{RR}$, and $t_{13}$, $t_{16}$, $t_{23}$, $t_{24}$ with $t_R$, $t_{RGR}$.


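Asymptotic expressions for the MSEs of the comparable estimators are presented below:
$$M(t_{11}) = M(t_R) + \bar{Y}^2(\theta_2 - \theta_1)(C_z^2 + 2C_{yz} - 2C_{xz}), \tag{6}$$
$$M(t_{13}) = M(t_{24}) = M(t_R) - \bar{Y}^2(\theta_2 - \theta_1)(\rho_{xz}^2C_x^2 - 2\rho_{yz}\rho_{xz}C_yC_x), \tag{7}$$
$$M(t_{14}) = M(t_R) + \bar{Y}^2\left[\theta_2(C_z^2 + 2C_{yz} - 2C_{xz}) - 2\theta_1(2C_{yz} - C_{xz})\right], \tag{8}$$
$$M(t_{16}) = M(t_R) - \bar{Y}^2(\theta_2 - 2\theta_1)(\rho_{xz}^2C_x^2 - 2\rho_{yz}\rho_{xz}C_yC_x), \tag{9}$$
$$M(t_{21}) = M(t_R) + \bar{Y}^2\left[\theta_2(C_z^2 + 2C_{yz} - 2C_{xz}) + 2\theta_1C_{xz}\right], \tag{10}$$
$$M(t_{23}) = M(t_R) - \bar{Y}^2\left[\theta_2(\rho_{xz}^2C_x^2 - 2\rho_{yz}\rho_{xz}C_yC_x) - 2\theta_1\rho_{xz}^2C_x^2\right], \tag{11}$$
where $C_{xz} = S_{xz}/(\bar{X}\bar{Z})$ and $S_{xz}$ is the population covariance between $x$ and $z$.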


In subsections 4.1 and 4.2, we present certain sufficient conditions showing that situations do arise where the
estimators obtained under the modified approach perform better than their respective counterparts under the Chand-Kiregyera
approach. Necessary conditions, however, are difficult to extract.
4.1. Comparison of $t_{11}$, $t_{14}$ and $t_{21}$ with $t_R$ and $t_{RR}$


From (1), (2) and (6), $M(t_{11}) < M(t_R)$ if
$$\rho_{xz}\frac{C_x}{C_z} > \frac{1}{2} + \rho_{yz}\frac{C_y}{C_z}, \tag{12}$$
and $M(t_{11}) < M(t_{RR})$ if
$$\rho_{xz}\frac{C_x}{C_z} > \frac{k_1}{2} + (2 - k_1)\rho_{yz}\frac{C_y}{C_z}, \tag{13}$$
where $k_1 = \dfrac{\theta_2 - 2\theta_1}{\theta_2 - \theta_1}$.


Note that $0 < k_1 < 1$ if $n_2 < \frac{n_1}{2}$, and $M(t_{RR}) < M(t_R)$ if $\rho_{yz} > \frac{1}{2}\frac{C_z}{C_y}$. Hence, we may conclude that if $t_{RR}$ is
superior to $t_R$, then $t_{11}$ is superior to both $t_R$ and $t_{RR}$ if
$$\rho_{xz}\frac{C_x}{C_z} > 1, \tag{14}$$
provided $n_2 < \frac{n_1}{2}$.


The condition $n_2 < \frac{n_1}{2}$ is a very mild restriction satisfied in many survey situations, and can be decided by the sampler at
the planning stage without any appreciable increase in cost.

In precisely the same way, omitting details of the derivations, we also deduce that when $t_{RR}$ is more precise
than $t_R$, then $t_{14}$ is more precise than both $t_R$ and $t_{RR}$ if


Cx 
Pxzc, > max(Ke, 1), 


Cx 
pg E> ke, (15) 


and $t_{21}$ is more precise than both $t_R$ and $t_{RR}$ if


Cx 
Pxz Gr > max(k3,k4), 


xg E> ke, (16) 
6-0 r) 1 
where k, = Th (> 0 forn, < 1) ks = an (> 0) and k, = en (> 0). 


4.2. Comparison of $t_{13}$, $t_{16}$ and $t_{23}$ with $t_R$ and $t_{RGR}$


From (1), (4), (7) and (9) we directly see that both $t_{13}$ and $t_{16}$ would be more efficient than both $t_R$ and $t_{RGR}$ if
$$\rho_{yz} < \frac{1}{2}\rho_{xz}\frac{C_x}{C_y}. \tag{17}$$
But, under this condition $t_{RGR}$ is less efficient than $t_R$. Hence, both $t_{13}$ and $t_{16}$ are superior to $t_R$ and $t_{RGR}$ just
when $t_{RGR}$ is inferior to $t_R$. In this sense the estimators $t_{13}$ and $t_{16}$ may be considered as complementary to $t_{RGR}$.


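Comparing $M(t_{23})$ with $M(t_R)$ and $M(t_{RGR})$, we have $M(t_{23}) < M(t_R)$ if
$$\rho_{yz} < \frac{1}{2}k_5\,\rho_{xz}\frac{C_x}{C_y}, \tag{18}$$
and $M(t_{23}) < M(t_{RGR})$ if
$$\rho_{yz} < \frac{1}{2}k_6\,\rho_{xz}\frac{C_x}{C_y}, \tag{19}$$
where $k_5 = \dfrac{\theta_2 - 2\theta_1}{\theta_2}$ ($> 0$ for $n_2 < \frac{n_1}{2}$) and $k_6 = \dfrac{\theta_2 - \theta_1}{\theta_2 + 2\theta_1}$ ($> 0$).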


Combining the preceding results, it should be clear that $t_{23}$ would be more efficient than $t_R$ and $t_{RGR}$ when
$$\rho_{yz} < \frac{1}{2}\min(k_5, k_6)\,\rho_{xz}\frac{C_x}{C_y}, \tag{20}$$
provided $n_2 < \frac{n_1}{2}$.

To have an idea of a specific real-life situation where condition (20) is realized, consider for example
$N = 200$, $n_1 = 30$ and $n_2 = 12$, which implies that $k_5 = 0.282$ and $k_6 = 0.373 > k_5$. Then, $\rho_{yz} < 0.141\,\rho_{xz}\frac{C_x}{C_y}$
would be sufficient for $t_{23}$ to be more efficient than $t_R$ and $t_{RGR}$.
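The constants in this example can be recomputed directly. The sketch below assumes the forms $k_5 = (\theta_2 - 2\theta_1)/\theta_2$ and $k_6 = (\theta_2 - \theta_1)/(\theta_2 + 2\theta_1)$; this is our reading of the definitions and should be treated as an assumption. The helper names `theta` and `k5_k6` are hypothetical.

```python
def theta(n, N):
    """Finite-population factor 1/n - 1/N."""
    return 1.0 / n - 1.0 / N


def k5_k6(th1, th2):
    """Constants appearing in conditions (18)-(20), under the assumed forms."""
    k5 = (th2 - 2.0 * th1) / th2
    k6 = (th2 - th1) / (th2 + 2.0 * th1)
    return k5, k6


N, n1, n2 = 200, 30, 12
th1, th2 = theta(n1, N), theta(n2, N)

# Unrounded theta values.
k5_exact, k6_exact = k5_k6(th1, th2)

# Theta values rounded to three decimals before forming the ratios.
k5_rounded, k6_rounded = k5_k6(round(th1, 3), round(th2, 3))
```

With unrounded $\theta_1$ and $\theta_2$ the constants are about 0.277 and 0.370, giving a bound of about $0.138\,\rho_{xz}C_x/C_y$. The quoted values 0.282 and 0.373 are reproduced when $\theta_1$ and $\theta_2$ are first rounded to 0.028 and 0.078, so the small discrepancy appears to be a rounding effect.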


In passing, we remark that, from the point of view of achievability and practicability, the conditions derived
in this subsection are less convenient than those derived in the preceding subsection.


5. SOME DESIGN-BASED PROPERTIES OF $t_1^{(G)}$ AND $t_2^{(G)}$


We now study the qualities of the generalized ratio estimators on the grounds of their design-based bias and MSE.
Since the exact expressions for these measures under a finite population set-up are not easily derivable, we rely only on the
approximate expressions derived using the Taylor linearization method. We provide these results in the following sub-sections
and mention our observations and remarks.
5.1. Biases of $t_1^{(G)}$ and $t_2^{(G)}$
Derived asymptotic expressions for the biases of $t_1^{(G)}$ and $t_2^{(G)}$ are
$$B(t_1^{(G)}) = B(t_R) + \bar{Y}D\left[(\theta_2 - \theta_1)\eta(\eta DC_z^2 + C_{yz} - 2C_{xz}) - \theta_1 d_1(C_{yz} - C_{xz})\right], \tag{21}$$
$$B(t_2^{(G)}) = B(t_R) + \bar{Y}D\left[(\theta_2 - \theta_1)\omega(\omega DC_z^2 + C_{yz} - 2C_{xz}) + \theta_1(\omega - d_2)(\omega DC_z^2 + C_{yz} - C_{xz})\right], \tag{22}$$
where $D = \bar{Z}/\bar{X}$ and $B(t_R)$ is the asymptotic expression for the bias of $t_R$ given by
$$B(t_R) = \bar{Y}(\theta_2 - \theta_1)(C_x^2 - C_{yx}). \tag{23}$$


As is already known, $B(t_R) = 0$, i.e., $t_R$ is approximately unbiased, when
$$C_x^2 - C_{yx} = 0 \;\Rightarrow\; \beta_{yx} = R, \tag{24}$$
which means that the regression line of $y$ on $x$ is linear and passes through the origin, where $\beta_{yx} = S_{yx}/S_x^2$ and $R = \bar{Y}/\bar{X}$. Since the
expressions (21) and (22) are not simple, it is not so easy to draw a similar conclusion on the biases of $t_1^{(G)}$ and $t_2^{(G)}$.
However, we see that their biases are small for larger samples. In the following, we shall just derive some sufficient
conditions for which $B(t_1^{(G)}) = B(t_2^{(G)}) = 0$ assuming that $B(t_R) = 0$, i.e., under the fulfillment of restriction (24),
because our objective here is to achieve improvements over $t_R$ in a certain sense.
From (21), $B(t_1^{(G)}) = 0$ when $B(t_R) = 0$ and, in addition, either $\eta = 0$ or $\eta DC_z^2 + C_{yz} - 2C_{xz} = 0$, and either $d_1 = 0$ or
$C_{yz} - C_{xz} = 0$. But we cannot take both $\eta$ and $d_1$ as zero, since $t_1^{(G)}$ then reduces to $t_R$. Further,
$$\eta DC_z^2 + C_{yz} - 2C_{xz} = 0 \quad\text{and}\quad C_{yz} - C_{xz} = 0$$
$$\Rightarrow\; \eta = \frac{2C_{xz} - C_{yz}}{DC_z^2} \quad\text{subject to } C_{yz} - C_{xz} = 0$$
$$\Rightarrow\; \eta = \frac{C_{xz}}{DC_z^2} = \beta_{xz}. \tag{25}$$
Hence, $B(t_1^{(G)}) = 0$ when $\eta = \beta_{xz}$ subject to the conditions that $d_1 \neq 0$ and $\beta_{yx} = R$.
To derive some useful results from (22), let us rewrite the equation in the following alternative form:
$$B(t_2^{(G)}) = B(t_R) + \bar{Y}D\left[\theta_2\omega(\omega DC_z^2 + C_{yz} - 2C_{xz}) - \theta_1\{\omega d_2DC_z^2 + d_2C_{yz} - (\omega + d_2)C_{xz}\}\right]. \tag{26}$$
Assuming that $\omega \neq 0$ and equating the second term on the right side of (26) to zero, we have
$$\omega = \frac{2C_{xz} - C_{yz}}{DC_z^2}. \tag{27}$$
On the other hand, equating the third term on the right side of (26) to zero subject to (27), we also find after
some simplification that
$$d_2 = \frac{2C_{xz} - C_{yz}}{DC_z^2}. \tag{28}$$
Therefore, if (24) is satisfied, $B(t_2^{(G)}) = 0$ when
$$d_2 = \omega = \frac{2C_{xz} - C_{yz}}{DC_z^2} = 2\beta_{xz} - \frac{\beta_{yz}}{R}, \tag{29}$$
where $\beta_{xz} = S_{xz}/S_z^2$ and $\beta_{yz} = S_{yz}/S_z^2$. But, for $\beta_{yx} = R$, we have
$$d_2 = \omega = 2\beta_{xz} - \frac{\beta_{yz}}{\beta_{yx}}. \tag{30}$$
In view of the preceding results, we have the following conclusions:


Under the assumption that $t_R$ is asymptotically unbiased, i.e., $\beta_{yx} = R$, $t_1^{(G)}$ is asymptotically unbiased if $\eta = \beta_{xz}$
and $d_1 \neq 0$, and $t_2^{(G)}$ is asymptotically unbiased if $d_2 = \omega = 2\beta_{xz} - \dfrac{\beta_{yz}}{\beta_{yx}}$.
yx 


5.2. Mean Square Errors of $t_1^{(G)}$ and $t_2^{(G)}$
Asymptotic expressions for the MSEs of the generalized ratio estimators are obtained as
$$M(t_1^{(G)}) = M(t_R) + \bar{Y}^2D\left[(\theta_2 - \theta_1)\eta(\eta DC_z^2 + 2C_{yz} - 2C_{xz}) + \theta_1d_1(d_1DC_z^2 - 2C_{yz})\right], \tag{31}$$
$$M(t_2^{(G)}) = M(t_R) + \bar{Y}^2D\left[(\theta_2 - \theta_1)\omega(\omega DC_z^2 + 2C_{yz} - 2C_{xz}) + \theta_1\{(\omega - d_2)^2DC_z^2 + 2(\omega - d_2)C_{yz}\}\right]. \tag{32}$$
These expressions indicate possible ranges or intervals for the coefficients within which the generalized
estimators would be better than $t_R$.
From (31) we note that $M(t_1^{(G)}) < M(t_R)$, i.e., $t_1^{(G)}$ would be more efficient than $t_R$, when
$$Q_1 = \eta(\eta DC_z^2 + 2C_{yz} - 2C_{xz}) < 0 \tag{33}$$
and
$$Q_2 = d_1(d_1DC_z^2 - 2C_{yz}) < 0. \tag{34}$$
These conditions hold iff the roots of the quadratic equations $Q_1 = 0$ in $\eta$ and $Q_2 = 0$ in $d_1$ are real and distinct,
and $\eta$ and $d_1$ lie between them. This leads to the restrictions
$$0 < \eta < 2\left(\beta_{xz} - \frac{\beta_{yz}}{R}\right) \quad\text{and}\quad 0 < d_1 < \frac{2\beta_{yz}}{R} \tag{35}$$
or
$$2\left(\beta_{xz} - \frac{\beta_{yz}}{R}\right) < \eta < 0 \quad\text{and}\quad \frac{2\beta_{yz}}{R} < d_1 < 0, \tag{36}$$
according as $0 < \frac{\beta_{yz}}{R} < \beta_{xz}$ or $\beta_{xz} < \frac{\beta_{yz}}{R} < 0$.
Combining the ranges for $\eta$ and $d_1$ given in (35) and (36), we further have
$$0 < \eta + d_1 < 2\beta_{xz} \tag{37}$$
and
$$2\beta_{xz} < \eta + d_1 < 0 \tag{38}$$
if $0 < \frac{\beta_{yz}}{R} < \beta_{xz}$ and $\beta_{xz} < \frac{\beta_{yz}}{R} < 0$ respectively.


The derived ranges for $\eta$ and $d_1$ in (35), (36), (37) and (38) provide certain guidelines for selecting their values in
order to improve the accuracy of $t_1^{(G)}$ compared to $t_R$.


From (32), $t_2^{(G)}$ would be superior to $t_R$ if
$$Q_3 = \omega(\omega DC_z^2 + 2C_{yz} - 2C_{xz}) < 0 \tag{39}$$
and
$$Q_4 = (\omega - d_2)^2DC_z^2 + 2(\omega - d_2)C_{yz} < 0. \tag{40}$$
The inequality (39) directly implies that
$$0 < \omega < 2\left(\beta_{xz} - \frac{\beta_{yz}}{R}\right) \tag{41}$$
and
$$2\left(\beta_{xz} - \frac{\beta_{yz}}{R}\right) < \omega < 0 \tag{42}$$
for $0 < \frac{\beta_{yz}}{R} < \beta_{xz}$ and $\beta_{xz} < \frac{\beta_{yz}}{R} < 0$ respectively.


Further, from (40) we have
$$-\frac{2\beta_{yz}}{R} < (\omega - d_2) < 0 \tag{43}$$
for $\beta_{yz} > 0$, and
$$0 < (\omega - d_2) < -\frac{2\beta_{yz}}{R} \tag{44}$$
for $\beta_{yz} < 0$.


Now we see that a plausible value of $\omega$ can be determined from (41) or (42) directly, whereas such a value for $d_2$
cannot be determined independently from (43) or (44). But, after deciding $\omega$ in light of (41) or (42), $d_2$ can be decided
using (43) or (44).


The upper and lower limits of the ranges for $\eta$, $\omega$, $d_1$ and $d_2$ calculated here depend exclusively on $\beta_{xz}$, $\beta_{yz}$ and $R$.
These ranges provide suitable values of the coefficients so as to make the proposed
generalized ratio estimators more efficient than the classical ratio estimator. Of course, this may not always be feasible
in the absence of known values of these parameters. However, prior knowledge from past data, surveys or
experience, or even guessed values closely approximating the true values, may be very helpful for this
purpose.
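For a quick look at the admissible intervals implied by (35), (36) and (41) to (44), the following sketch returns them for guessed values of $\beta_{xz}$, $\beta_{yz}$ and $R$. The function name `coefficient_ranges` and the numerical inputs are hypothetical, and the sketch assumes $\bar{X} > 0$ and $\bar{Z} > 0$ so that $D > 0$.

```python
def open_interval(a, b):
    """Return the end points of the open interval between a and b, in order."""
    return (min(a, b), max(a, b))


def coefficient_ranges(beta_xz, beta_yz, R):
    """Open intervals in which Q1, Q2, Q3 and Q4 of (33), (34), (39), (40) are negative."""
    g = beta_yz / R                       # beta_yz / R, i.e. C_yz / (D * C_z^2)
    upper = 2.0 * (beta_xz - g)
    return {
        "eta": open_interval(0.0, upper),                 # (35) / (36)
        "d1": open_interval(0.0, 2.0 * g),                # (35) / (36)
        "omega": open_interval(0.0, upper),               # (41) / (42)
        "omega_minus_d2": open_interval(-2.0 * g, 0.0),   # (43) / (44)
    }


# Hypothetical guessed values of the parameters.
ranges = coefficient_ranges(beta_xz=2.0, beta_yz=1.5, R=3.0)
```

Ordering the end points with `min` and `max` covers both sign cases at once, so the same function serves (35) and (36), and likewise (41) to (44).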
6. DETERMINATION OF OPTIMAL COEFFICIENTS FOR $t_1^{(G)}$ AND $t_2^{(G)}$


As said earlier, proper selection of the coefficients makes the proposed generalized estimators more effective and
practicable. Towards a solution of this problem, the discussion in the preceding section is helpful to some extent, as it
constructs appropriate ranges in terms of certain population parameters. Here, however, we obtain the best values,
i.e., the optimal values of the coefficients $d_1$, $\eta$, $d_2$ and $\omega$ which minimize the MSEs of the concerned estimators.


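For minimizing $M(t_1^{(G)})$, we differentiate equation (31) with respect to $\eta$ and $d_1$, and equate the resulting expressions to
zero to obtain the following normal equations:
$$\frac{\partial M(t_1^{(G)})}{\partial \eta} = 0 \;\Rightarrow\; \eta DC_z^2 + C_{yz} - C_{xz} = 0, \tag{45}$$
$$\frac{\partial M(t_1^{(G)})}{\partial d_1} = 0 \;\Rightarrow\; d_1DC_z^2 - C_{yz} = 0, \tag{46}$$
which give the optimal values
$$\hat{\eta} = \beta_{xz} - \frac{\beta_{yz}}{R} \tag{47}$$
and
$$\hat{d}_1 = \frac{\beta_{yz}}{R}. \tag{48}$$
After making use of these optimal values of the coefficients, we find the minimum value of $M(t_1^{(G)})$ as
$$M_{\min}(t_1^{(G)}) = M(t_R) - \bar{Y}^2C_y^2\left[(\theta_2 - \theta_1)\left(\rho_{yz} - \frac{C_x}{C_y}\rho_{xz}\right)^2 + \theta_1\rho_{yz}^2\right]. \tag{49}$$
This minimum MSE may be designated as the minimum MSE bound of $t_1^{(G)}$. An estimator whose MSE equals
$M_{\min}(t_1^{(G)})$ is termed a minimum MSE bound estimator, and is obtained by substituting the optimal values $\hat{\eta}$ and $\hat{d}_1$
in $t_1^{(G)}$. Here, this estimator turns out to be the following ratio-type estimator:
$$\hat{t}_1^{(G)} = \bar{y}_2\,\frac{\bar{x}_1 - \frac{\beta_{yz}}{R}(\bar{z}_1 - \bar{Z})}{\bar{x}_2 - \left(\beta_{xz} - \frac{\beta_{yz}}{R}\right)(\bar{z}_2 - \bar{z}_1)}.$$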


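Differentiating $M(t_2^{(G)})$ in (32) partially with respect to $\omega$ and $d_2$, we also derive the following two normal
equations:
$$\frac{\partial M(t_2^{(G)})}{\partial \omega} = 0 \;\Rightarrow\; \theta_2(\omega DC_z^2 + C_{yz} - C_{xz}) + \theta_1(-d_2DC_z^2 + C_{xz}) = 0, \tag{50}$$
$$\frac{\partial M(t_2^{(G)})}{\partial d_2} = 0 \;\Rightarrow\; (\omega - d_2)DC_z^2 + C_{yz} = 0. \tag{51}$$
One can easily note that the normal equations (50) and (51) form a system of simultaneous equations. For this
reason, the optimum values of $\omega$ and $d_2$ cannot be obtained from either equation alone, as each equation involves both
coefficients. Denoting $\hat{\omega}$ and $\hat{d}_2$ as the optimum values of $\omega$ and $d_2$, from (51) we now have
$$\hat{d}_2 = \hat{\omega} + \frac{\beta_{yz}}{R}. \tag{52}$$
Using (52), from (50) we finally obtain
$$\hat{\omega} = \beta_{xz} - \frac{\beta_{yz}}{R} \tag{53}$$
and
$$\hat{d}_2 = \beta_{xz}. \tag{54}$$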


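The foregoing analysis shows that the minimum MSE bound and the corresponding minimum MSE bound estimator of $t_2^{(G)}$
are respectively
$$M_{\min}(t_2^{(G)}) = M(t_R) - \bar{Y}^2C_y^2\left[(\theta_2 - \theta_1)\left(\rho_{yz} - \frac{C_x}{C_y}\rho_{xz}\right)^2 + \theta_1\rho_{yz}^2\right] \tag{55}$$
and
$$\hat{t}_2^{(G)} = \bar{y}_2\,\frac{\bar{x}_1 - \beta_{xz}(\bar{z}_1 - \bar{Z})}{\bar{x}_2 - \left(\beta_{xz} - \frac{\beta_{yz}}{R}\right)(\bar{z}_2 - \bar{Z})}.$$
Here the important point to note is that $M_{\min}(t_1^{(G)}) = M_{\min}(t_2^{(G)})$, which implies that the minimum MSE bounds
of $t_1^{(G)}$ and $t_2^{(G)}$ are the same.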
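The equality of the two minimum MSE bounds can be checked numerically by plugging the optimal coefficients into the bracketed terms of (31) and (32) and comparing with the closed form in (49) and (55). The sketch below does this for made-up parameter values; the function names and the dictionary `p` are hypothetical.

```python
def theta(n, N):
    """Finite-population factor 1/n - 1/N."""
    return 1.0 / n - 1.0 / N


def excess_t1(eta, d1, p):
    """M(t1G) - M(t_R) according to (31)."""
    D, Cz2 = p["D"], p["Cz"] ** 2
    return p["Ybar"] ** 2 * D * (
        (p["th2"] - p["th1"]) * eta * (eta * D * Cz2 + 2 * p["Cyz"] - 2 * p["Cxz"])
        + p["th1"] * d1 * (d1 * D * Cz2 - 2 * p["Cyz"]))


def excess_t2(omega, d2, p):
    """M(t2G) - M(t_R) according to (32)."""
    D, Cz2 = p["D"], p["Cz"] ** 2
    u = omega - d2
    return p["Ybar"] ** 2 * D * (
        (p["th2"] - p["th1"]) * omega * (omega * D * Cz2 + 2 * p["Cyz"] - 2 * p["Cxz"])
        + p["th1"] * (u ** 2 * D * Cz2 + 2 * u * p["Cyz"]))


# Hypothetical population parameters.
rho_yz, rho_xz = 0.7, 0.8
p = dict(Ybar=50.0, D=0.5, Cy=0.4, Cx=0.5, Cz=0.6,
         th1=theta(30, 200), th2=theta(12, 200))
p["Cyz"] = rho_yz * p["Cy"] * p["Cz"]
p["Cxz"] = rho_xz * p["Cx"] * p["Cz"]

beta_xz = p["Cxz"] / (p["D"] * p["Cz"] ** 2)   # equals S_xz / S_z^2
g = p["Cyz"] / (p["D"] * p["Cz"] ** 2)         # equals beta_yz / R

gain1 = excess_t1(beta_xz - g, g, p)           # optimal (eta, d1) from (47), (48)
gain2 = excess_t2(beta_xz - g, beta_xz, p)     # optimal (omega, d2) from (53), (54)
closed_form = -p["Ybar"] ** 2 * p["Cy"] ** 2 * (
    (p["th2"] - p["th1"]) * (rho_yz - p["Cx"] / p["Cy"] * rho_xz) ** 2
    + p["th1"] * rho_yz ** 2)
```

Both `gain1` and `gain2` agree with `closed_form` up to rounding, and moving any coefficient away from its optimal value increases the corresponding MSE.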
As in the case of range determination, the optimal coefficients $\hat{d}_1$, $\hat{\eta}$, $\hat{d}_2$ and $\hat{\omega}$ also require known values of the
parameters $\beta_{yz}$, $\beta_{xz}$ and $R$; otherwise the optimum estimators $\hat{t}_1^{(G)}$ and $\hat{t}_2^{(G)}$ cannot be computed from the survey data. But on
most occasions these parameters are unknown, and the usual practice is to estimate them using the data
available from the second-phase sample.


Let $b_{yz(2)} = \dfrac{\sum_{i \in s_2}(y_i - \bar{y}_2)(z_i - \bar{z}_2)}{\sum_{i \in s_2}(z_i - \bar{z}_2)^2}$, $b_{xz(2)} = \dfrac{\sum_{i \in s_2}(x_i - \bar{x}_2)(z_i - \bar{z}_2)}{\sum_{i \in s_2}(z_i - \bar{z}_2)^2}$
and $r_2 = \dfrac{\bar{y}_2}{\bar{x}_2}$ respectively be the consistent estimators of
$\beta_{yz}$, $\beta_{xz}$ and $R$ based on $s_2$. Then for computational purposes the optimum estimators shall be defined in the following
manner:
$$\hat{t}_{1e}^{(G)} = \bar{y}_2\,\frac{\bar{x}_1 - \frac{b_{yz(2)}}{r_2}(\bar{z}_1 - \bar{Z})}{\bar{x}_2 - \left(b_{xz(2)} - \frac{b_{yz(2)}}{r_2}\right)(\bar{z}_2 - \bar{z}_1)} \quad\text{and}\quad \hat{t}_{2e}^{(G)} = \bar{y}_2\,\frac{\bar{x}_1 - b_{xz(2)}(\bar{z}_1 - \bar{Z})}{\bar{x}_2 - \left(b_{xz(2)} - \frac{b_{yz(2)}}{r_2}\right)(\bar{z}_2 - \bar{Z})},$$
where the subscript $e$ is used here only to mark the versions with estimated coefficients.


It is important to understand here that the use of sample estimates in place of the respective unknown parameters
does not change the asymptotic MSE expressions of the resulting estimators, i.e.,
$$M_{\min}(t_1^{(G)}) = M_{\min}(t_2^{(G)}) = M(\hat{t}_{1e}^{(G)}) = M(\hat{t}_{2e}^{(G)}).$$


To sum up our preceding theoretical results, once again we remark that although the classes of estimators
constructed by $t_1^{(G)}$ and $t_2^{(G)}$ have the same minimum MSE bounds, their minimum MSE bound estimators are different.
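A minimal sketch of how the two optimum estimators with estimated coefficients might be computed from sample data is given below. The function names (`mean`, `slope`, `optimum_estimates`), the index-list representation of $s_1$ and $s_2$, and the toy population are all hypothetical choices of ours.

```python
def mean(values):
    return sum(values) / len(values)


def slope(u, z):
    """Least-squares slope of u on z (sample regression coefficient)."""
    ub, zb = mean(u), mean(z)
    return (sum((a - ub) * (b - zb) for a, b in zip(u, z))
            / sum((b - zb) ** 2 for b in z))


def optimum_estimates(y, x, z, s1, s2, Zbar):
    """Return the two optimum ratio-type estimates with estimated coefficients.

    s1, s2 are lists of unit indices (s2 a subset of s1); Zbar is the known
    population mean of z. y is only read on the units in s2.
    """
    xbar1 = mean([x[i] for i in s1])
    zbar1 = mean([z[i] for i in s1])
    y2, x2, z2 = ([v[i] for i in s2] for v in (y, x, z))
    ybar2, xbar2, zbar2 = mean(y2), mean(x2), mean(z2)
    b_yz, b_xz = slope(y2, z2), slope(x2, z2)   # estimates of beta_yz, beta_xz
    r2 = ybar2 / xbar2                          # estimate of R
    eta_hat = b_xz - b_yz / r2                  # also serves as omega_hat
    t1 = ybar2 * (xbar1 - (b_yz / r2) * (zbar1 - Zbar)) / (xbar2 - eta_hat * (zbar2 - zbar1))
    t2 = ybar2 * (xbar1 - b_xz * (zbar1 - Zbar)) / (xbar2 - eta_hat * (zbar2 - Zbar))
    return t1, t2


# Toy population (hypothetical): x is exactly linear in z and y is proportional to x.
z = [float(v) for v in range(1, 21)]
x = [3.0 + 2.0 * v for v in z]
y = [4.0 * v for v in x]
s1 = list(range(0, 20, 2))
s2 = s1[:4]
t1_hat, t2_hat = optimum_estimates(y, x, z, s1, s2, Zbar=mean(z))
```

In this idealized population both estimates reproduce $\bar{Y} = 96$ exactly (up to rounding), since the regression adjustment on $z$ recovers $\bar{X}$ and the ratio $y/x$ is constant. With real data the two estimates generally differ.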


7. COMPARISON OF $t_1^{(G)}$ AND $t_2^{(G)}$ WITH $t$
As said earlier, the estimator $t$ defined in Section 2 under the Chand-Kiregyera approach produces a system of
estimators covering $t_R$, $t_{RR}$, $t_{PR}$ and $t_{RGR}$ as its potential members. This system is also a subclass of
the wider classes of estimators generated by $t_1^{(G)}$ and $t_2^{(G)}$ for $\eta = \omega = 0$ and $d_1 = d_2 = d$. But to confirm that our
modified technique is better than the Chand-Kiregyera method, we need a comparison of $t_1^{(G)}$ and $t_2^{(G)}$ with $t$ at
least in respect of bias and MSE.

Considering $\eta = \omega = 0$ and substituting $d_1 = d_2 = d$, asymptotic expressions for the bias and MSE of $t$ can be
directly obtained from the corresponding expressions for $t_1^{(G)}$ or $t_2^{(G)}$. Hence we have
$$B(t) = B(t_R) - \bar{Y}dD\theta_1(C_{yz} - C_{xz}) \tag{56}$$
and
$$M(t) = M(t_R) + \bar{Y}^2dD\theta_1(dDC_z^2 - 2C_{yz}). \tag{57}$$


Note that, unlike for $t_1^{(G)}$ and $t_2^{(G)}$, the condition for $B(t) = 0$ does not depend on the coefficient $d$: $B(t) = 0$ if
$$B(t_R) = 0 \quad\text{and}\quad C_{yz} - C_{xz} = 0$$
$$\Rightarrow\; \beta_{yx} = R \quad\text{and}\quad \beta_{xz} = \frac{\beta_{yz}}{\beta_{yx}}. \tag{58}$$
This means that in situations where $t_R$ is asymptotically unbiased, $t$ is likewise unbiased if $\beta_{xz} = \dfrac{\beta_{yz}}{\beta_{yx}}$.
Of course this restriction appears to be more severe than the similar restriction for $t_1^{(G)}$ but equally stringent to that of $t_2^{(G)}$.
From (57), $M(t) < M(t_R)$ if either
$$d < 0 \quad\text{and}\quad dDC_z^2 - 2C_{yz} > 0$$
or
$$d > 0 \quad\text{and}\quad dDC_z^2 - 2C_{yz} < 0.$$
These conditions further imply that $t$ is more precise than $t_R$ when either
$$\frac{2\beta_{yz}}{R} < d < 0 \tag{59}$$
or
$$0 < d < \frac{2\beta_{yz}}{R} \tag{60}$$
according as $\beta_{yz} < 0$ or $\beta_{yz} > 0$ (for $R > 0$). These conditions are equivalent to the second conditions of (36) and (35) respectively.


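From (31) and (57),
$$M(t) - M(t_1^{(G)}) = -\bar{Y}^2D\left[(\theta_2 - \theta_1)\eta(\eta DC_z^2 + 2C_{yz} - 2C_{xz}) + \theta_1(d_1 - d)\{(d_1 + d)DC_z^2 - 2C_{yz}\}\right]. \tag{61}$$
Hence, $t_1^{(G)}$ would be more efficient than $t$ if
$$\eta > 2\left(\beta_{xz} - \frac{\beta_{yz}}{R}\right) \tag{62}$$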
and 
Byz 
ae aaa (63) 


But if $d_1 = d$, (62) is the only sufficient condition to make $t_1^{(G)}$ more efficient than $t$.


From (32) and (57) we also have
$$M(t) - M(t_2^{(G)}) = -\bar{Y}^2D\left[(\theta_2 - \theta_1)\omega(\omega DC_z^2 + 2C_{yz} - 2C_{xz}) + \theta_1\{(\omega - d_2) + d\}\{((\omega - d_2) - d)DC_z^2 + 2C_{yz}\}\right]. \tag{64}$$
Here we see that $M(t_2^{(G)}) < M(t)$ when
$$\omega > 2\left(\beta_{xz} - \frac{\beta_{yz}}{R}\right) \tag{65}$$
and
$$\omega - d_2 > \frac{2\beta_{yz}}{R} - d. \tag{66}$$
After fixing the value of $\omega$ according to (65), a choice of $d_2$ is made according to (66). On the other hand,
if $d_2 = d$ then
$$\omega > \max\left[2\left(\beta_{xz} - \frac{\beta_{yz}}{R}\right), \frac{2\beta_{yz}}{R}\right] \tag{67}$$
is the sufficient condition so that $t_2^{(G)}$ would be more efficient than $t$.

It may be remarked here that the above sufficient conditions favoring $t_1^{(G)}$ and $t_2^{(G)}$ are of course difficult to
check on many occasions. But they clearly indicate that there is scope for the formulated estimation
strategy to improve upon that considered in Chand (1975) and Kiregyera (1980). Moreover, the following discussion
shows that for optimum choices of the coefficients, the former strategy always yields an efficiency gain over the latter
one.
7.1. Efficiency Comparison for Optimal Coefficients 


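From (57), the optimal value of $d$ that minimizes $M(t)$ is
$$\hat{d} = \frac{\beta_{yz}}{R},$$
and the resulting minimum MSE bound and the minimum MSE bound estimator are respectively
$$M_{\min}(t) = M(t_R) - \bar{Y}^2\theta_1C_y^2\rho_{yz}^2 \tag{68}$$
and
$$\hat{t} = \bar{y}_2\,\frac{\bar{x}_1 - \frac{\beta_{yz}}{R}(\bar{z}_1 - \bar{Z})}{\bar{x}_2} \quad\text{or}\quad \bar{y}_2\,\frac{\bar{x}_1 - \frac{b_{yz(2)}}{r_2}(\bar{z}_1 - \bar{Z})}{\bar{x}_2}$$
if $\beta_{yz}$ and $R$ are estimated.


Now we see that $M_{\min}(t_1^{(G)}) = M_{\min}(t_2^{(G)}) \le M_{\min}(t)$, with equality only when $\rho_{yz}C_y = \rho_{xz}C_x$. This shows that $t$ is less efficient than both $t_1^{(G)}$ and $t_2^{(G)}$
with respect to the minimum MSE bound criterion.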
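The size of the gain follows in one step from (49), (55) and (68). The short derivation below is added here as a worked check; it assumes $n_2 < n_1$, so that $\theta_2 > \theta_1$.

```latex
\begin{aligned}
M_{\min}(t) - M_{\min}\bigl(t_1^{(G)}\bigr)
  &= \Bigl[M(t_R) - \bar{Y}^2\theta_1 C_y^2\rho_{yz}^2\Bigr]
   - \Bigl[M(t_R) - \bar{Y}^2 C_y^2\Bigl\{(\theta_2-\theta_1)
       \Bigl(\rho_{yz} - \tfrac{C_x}{C_y}\rho_{xz}\Bigr)^2
       + \theta_1\rho_{yz}^2\Bigr\}\Bigr] \\
  &= \bar{Y}^2 C_y^2(\theta_2-\theta_1)
       \Bigl(\rho_{yz} - \tfrac{C_x}{C_y}\rho_{xz}\Bigr)^2 \;\ge\; 0 .
\end{aligned}
```

The difference vanishes only when $\rho_{yz}C_y = \rho_{xz}C_x$, and the same expression holds with $t_2^{(G)}$ in place of $t_1^{(G)}$ because the two bounds coincide.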


8. SOME REMARKS ON THE EFFICIENCIES OF $\hat{t}$, $\hat{t}_1^{(G)}$ AND $\hat{t}_2^{(G)}$


It has already been shown that the four estimators $t_R$, $t_{RR}$, $t_{PR}$ and $t_{RGR}$, and the series of estimators $t_{11}$, $t_{12}$,
$t_{13}$, $t_{14}$, $t_{15}$, $t_{16}$, $t_{21}$, $t_{22}$, $t_{23}$ and $t_{24}$ considered in Section 3, are specific cases of either $t_1^{(G)}$ or $t_2^{(G)}$ or both.
These estimators therefore cannot be more efficient than the minimum MSE bound estimators $\hat{t}_1^{(G)}$ and $\hat{t}_2^{(G)}$. On the same
ground, $t_R$, $t_{RR}$, $t_{PR}$ and $t_{RGR}$, being particular cases of $t$, cannot be more efficient than $\hat{t}$.


Further, from (49), (55) and (68) note that
$$M_{\min}(t_1^{(G)}) = M_{\min}(t_2^{(G)}) \le M_{\min}(t).$$
This leads to the conclusion that $\hat{t}$ is inferior to both $\hat{t}_1^{(G)}$ and $\hat{t}_2^{(G)}$, except in the boundary case $\rho_{yz}C_y = \rho_{xz}C_x$ where the bounds coincide.


9. EMPIRICAL STUDY 


In the previous sections, while studying the precision of one estimator over others, we derived various sufficient conditions in
terms of certain parametric functions. But some of the derived conditions are too complicated to be checked in a specific
situation of practical interest, which makes identifying a better estimator among others difficult. Hence, we
need a quantitative analysis of the performance of the different estimators. For this we carry out an empirical study
using data on 12 natural populations as described in Table 1. This not only helps to evaluate the gain in efficiency of $\hat{t}_1^{(G)}$
or $\hat{t}_2^{(G)}$ over other estimators but also makes it easier to identify a better estimator among the others. As in Section 4, to keep the
study manageable we deal with $\rho_{yx} > 0$, $\rho_{yz} > 0$ and $\rho_{xz} > 0$. Hence, the estimators under consideration are $t_R$, $t_{RR}$,
$t_{RGR}$, $t_{11}$, $t_{13}$, $t_{14}$, $t_{16}$, $t_{21}$, $t_{23}$ and $t_{24}$. But, keeping in mind that $M(t_{24}) = M(t_{13})$ and $M(\hat{t}_1^{(G)}) = M(\hat{t}_2^{(G)})$, we quote results
for $t_{13}$ and $\hat{t}_1^{(G)}$ only.


To examine the relative performance of the selected estimators of $\bar{Y}$, we have computed their percentage relative
efficiencies (PREs) compared to the conventional estimator $\bar{y}_2$ with $V(\bar{y}_2) = \bar{Y}^2\theta_2C_y^2$. Using the SRSWOR scheme at each
phase, computed values of the PREs of the comparable estimators for different values of $n_1$ and $n_2$ meeting the
restriction $n_2 < \frac{n_1}{2}$ are displayed in Table 2.
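The PRE of an estimator relative to $\bar{y}_2$ is $100 \times V(\bar{y}_2)/M(\cdot)$. As an illustration, the sketch below computes the PRE of $t_R$ from (1); the function names and parameter values are hypothetical and are not taken from the populations in Table 1.

```python
def theta(n, N):
    """Finite-population factor 1/n - 1/N."""
    return 1.0 / n - 1.0 / N


def pre(var_reference, mse_estimator):
    """Percentage relative efficiency with respect to a reference estimator."""
    return 100.0 * var_reference / mse_estimator


def pre_of_ratio_estimator(Cy, Cx, rho_yx, n1, n2, N):
    """PRE of t_R relative to ybar2; the common factor Ybar^2 cancels."""
    th1, th2 = theta(n1, N), theta(n2, N)
    var_ybar2 = th2 * Cy**2                                           # V(ybar2) / Ybar^2
    mse_tr = th2 * Cy**2 + (th2 - th1) * (Cx**2 - 2 * rho_yx * Cy * Cx)  # (1) / Ybar^2
    return pre(var_ybar2, mse_tr)


# Hypothetical parameter values.
value = pre_of_ratio_estimator(Cy=1.0, Cx=1.0, rho_yx=0.9, n1=30, n2=12, N=200)
```

A PRE above 100 means the estimator is more efficient than $\bar{y}_2$; in this hypothetical case the PRE of $t_R$ is roughly 204.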


Table 1: Description of the Populations


Pop. No. | Source | N | y | x | z
1 | (source illegible) | 34 villages | area under wheat in 1964 | area under wheat in 1963 | cultivated area in 1961
2 | Perry (2007) | 8011 households | (illegible) | (illegible) | household income-earners (partly illegible)
3 | (source illegible) | 34 villages | area under wheat in 1937 | (illegible) | total cultivated area in 1931
4 | (source illegible) | 25 brothers | head length of second son | head length of first son | head breadth of first son
5 | Sukhatme and Chand (1977) | 20 trees | bushels of apples harvested in 1964 | apple trees of bearing age in 1964 | bushels of apples harvested in 1959
6 | Cochran (1977), p. 181 | 34 countries | number of paralytic polio cases in the placebo group | number of placebo children | number of paralytic polio cases in the not inoculated group
7 | Shukla (1966) | 50 plants | fiber yield/plant | plant green weight | base diameter
8 | Srivastava (1971) | 50 plants | yield/plant | height of the plant | base diameter
9 | Tripathi (1980) | 225 households | persons in service | educated persons | size of households
10 | (source illegible) | 80 factories | output | no. of workers | fixed capital
11 | Fisher (1936) | 50 Iris flowers (versicolor) | sepal width | sepal length | petal length
12 | Fisher (1936) | 50 Iris flowers (virginica) | petal width | sepal width | petal length


After careful examination of the tabulated values of the PREs, we now summarize our numerical findings in the
following manner:


• As the theory asserts, $\hat{t}_1^{(G)}$ attains the maximum precision amongst all, with appreciable efficiency gain, for all
populations taken into consideration.


• Both $t_{RR}$ and $t_{RGR}$ are more efficient than $t_R$. But, as expected, the performance of $\hat{t}$ is better than that of $t_R$, $t_{RR}$ and
$t_{RGR}$ in all cases.


• The estimators $t_{11}$, $t_{14}$ and $t_{21}$ are preferable to $t_{RR}$, but $t_{21}$ appears to be less preferable than both $t_{11}$ and $t_{14}$.


• Amongst $t_{13}$, $t_{16}$ and $t_{23}$, $t_{23}$ comes out as the worst one. Although $t_{13}$ and $t_{16}$ are superior to
$t_{RGR}$ for all populations, $t_{23}$ is so only for 8 populations (all except the first 4). This negative result for $t_{23}$ is due to nonfulfillment of its
favorable conditions.
Table 2: PREs of Different Estimators


Pop. No. | $n_1$ | $n_2$ | $t_R$ | $t_{RR}$ | $t_{11}$ | $t_{14}$ | $t_{21}$ | $t_{RGR}$ | $t_{13}$ | $t_{16}$ | $t_{23}$ | $\hat{t}$ | $\hat{t}_1^{(G)}$
1 | 15 | 7 | 156.9 | 256.5 | 411.3 | 354.2 | 266.3 | 408.6 | 515.3 | 638.5 | 341.9 | 618.7 | 819.0
2 | 600 | 250 | 147.5 | 165.3 | 180.5 | 190.3 | 167.5 | 150.2 | 151.7 | 153.1 | 149.0 | 210.5 | 225.8
3 | 12 | 5 | 147.7 | 466.9 | 503.8 | 516.1 | 488.3 | 573.9 | 582.3 | 575.2 | 441.8 | 578.6 | 584.9
4 | 10 | 4 | 130.2 | 178.8 | 185.1 | 203.5 | 182.3 | 185.1 | 197.9 | 187.1 | 178.0 | 190.2 | 202.5
5 | 50 | 20 | 256.4 | 409.2 | 499.9 | 523.9 | 467.3 | 483.5 | 516.1 | 520.5 | 489.6 | 519.5 | 529.3
6 | 13 | 6 | 112.8 | 135.8 | 186.3 | 199.1 | 136.9 | 153.6 | 197.2 | 226.0 | 157.0 | 208.0 | 259.2
7 | 25 | 8 | 184.6 | 186.6 | 199.6 | 202.4 | 186.9 | 175.6 | 192.7 | 189.4 | 179.7 | 193.6 | 215.3
8 | 20 | 7 | 143.3 | 164.4 | 184.2 | 191.2 | 169.9 | 148.5 | 190.0 | 196.3 | 150.8 | 183.8 | 235.9
9 | 50 | 20 | 116.5 | 126.2 | 127.6 | 129.9 | 126.8 | 130.5 | 137.3 | 137.9 | 135.3 | 139.6 | 140.3
10 | 30 | 10 | 172.5 | 246.1 | 251.2 | 286.4 | 248.2 | 264.6 | 290.6 | 278.4 | 273.2 | 296.4 | 315.6
11 | 20 | 8 | 115.9 | 123.5 | 169.6 | 172.3 | 145.4 | 134.2 | 194.4 | 206.2 | 170.1 | 208.2 | 238.6
12 | 18 | 7 | 107.0 | 114.1 | 182.7 | 181.6 | 164.4 | 122.5 | 208.3 | 212.7 | 197.3 | 210.3 | 222.7


After further scrutinizing the empirical findings, it may finally be concluded that the proposed estimation method
in terms of $t_1^{(G)}$ and $t_2^{(G)}$ can be gainfully employed in many survey situations.